ABSTRACT



Context. In a probabilistic framework of the interpretation of the initial mass function (IMF), the IMF cannot be arbitrarily normalized to the total mass, $M$, or number of stars, $N$, of the system. Hence, the inference of $M$ and $N$ when partial information about the studied system is available must be revised (i.e., the contribution to the total quantity cannot be obtained by simple algebraic manipulations of the IMF).

Aims. We study how to include constraints in the IMF to make inferences about different quantities characterizing stellar systems. It is 
expected that including any particular piece of information about a system would constrain the range of possible solutions. However, 
different pieces of information might be irrelevant depending on the quantity to be inferred. In this work we want to characterize the 
relevance of the priors in the possible inferences. 

Methods. Assuming that the IMF is a probability distribution function, we derive the sampling distributions of M and N of the system 
constrained to different types of information available. 

Results. We show that the value of $M$ that would be inferred must be described as a probability distribution $\Phi_M[M; m_a, N_a, \Phi_N(N)]$ that depends on the completeness limit of the data, $m_a$, the number of stars observed down to this limit, $N_a$, and the prior hypothesis made on the distribution of the total number of stars in clusters, $\Phi_N(N)$.

Key words. stars: statistics – galaxies: stellar content – methods: data analysis



1. Introduction 

The study of cluster dynamics and star formation relies on the knowledge of cluster masses and the amount of such mass transformed into stars, $M$. In most cases, we have partial information on the system, i.e., the observations of some stars in the cluster. Such information is usually used in the inverse problem, using the initial mass function (IMF) realization (see below) as a distribution by number to make inferences about a theoretical probability distribution function, the IMF $\phi(m)$ (Bouvier et al.; Briceño et al. 2002; Luhman et al. 2003; Oliveira et al.; Bayo et al. 2011). However, such information is not enough to obtain cluster masses, and for some astrophysical studies it is required to assume a $\phi(m)$ covering the whole range of possible stellar masses to make inferences about global cluster properties (the direct problem).

This use of the term IMF for both the distribution by number for the inverse problem of statistics and the probability distribution function (pdf) for the direct problem can lead to different interpretations of the IMF itself and the results obtained from it (cf. Cerviño et al. 2012, hereafter Paper I). In this work, following Scalo (1986), we will adopt the pdf definition.^1

The shape of the pdf and that of the distribution by number depend crucially on the size of the sample, that is, the number of stars $N$; for large $N$ values, the two shapes tend to be similar. However, this similarity can mislead one into believing that the distribution by number is just a scaled-up version of the pdf, with $N$ being the scale factor. This would be very wrong since the physical meanings of both distributions are intrinsically different; Paper I is dedicated to exploring the consequences of this essential difference.

As a consequence, the standard methodology used to infer $M$ values, which assumes the use of a correction factor for unobserved stars, is no longer valid. The main goal of this paper is to define a methodology based on the probabilistic approach of the IMF to obtain the total stellar mass $M$ of a stellar sample from limited information on the sample itself.

^1 This definition implies that stellar masses are independent and identically distributed; we refer to Paper I for more details.

This task is far from trivial, as we have to bridge different gaps according to the amount of unknown information. We start the discussion by making an inventory of possible scenarios that differ from each other according to the amount of information available, with the aim of emphasizing how this affects the determination of $M$ and $N$. Five such scenarios are:

1. We know (from the IMF) the probability of a random star having a mass $m_{\rm star}$ equal to or larger than some given value $m_a$, but no specific information on the particular cluster is known.

2. We know (from observations) the number of stars $N$ in a particular cluster; we also know (from the IMF) the expected number of stars with $m > m_a$.

3. We know (from observations) the number of stars $N$ in a particular cluster; we also know (from observations, too) that $N_a$ stars have $m > m_a$ and the mass of such stars.

4. We know that a particular cluster has $N_a$ stars with $m > m_a$ and the mass of such stars from observations.

5. We know that a particular cluster has $N_a$ stars with $m > m_a$ and the mass of such stars, and we also know its total mass $M$.

In scenario 1, which relies solely on knowledge of the IMF, we only know a theoretical probability that is independent of $N$ and $M$. Consequently, we have information neither on $M$ nor on the actual value of $m_{\rm star}$.

In scenario 2 we know that the cluster is the result of sampling the IMF with $N$ stars. With such information, we can compute the sampling distribution of $M$: that is, the distribution of possible values of $M$ constrained by the value of $N$. In particular, if $N = 1$ the distribution of total masses is the IMF itself, and if $N \to \infty$, the distribution of $M$ is a Gaussian, because of the central limit theorem. In all intermediate cases, the sampling distribution of $M$ at a given $N$ is a more or less asymmetric function, which in turn implies that its mean value $\langle M \rangle$ is not (in general) the same as its most probable value.
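The behavior described for scenario 2 can be illustrated with a small Monte Carlo experiment. The sketch below is not the computation used in this work: it assumes a single power-law IMF with exponent 2.35 between 0.1 and 100 solar masses (hypothetical values chosen only for illustration, as are the function names), draws many clusters with a fixed $N$, and compares the mean and the median of the resulting total masses.

```python
import random
import statistics

def sample_power_law(rng, alpha, m_low, m_up):
    """Draw one mass from phi(m) ~ m**(-alpha) on [m_low, m_up) by inverse-CDF sampling."""
    e = 1.0 - alpha                      # exponent after integration (alpha != 1 assumed)
    u = rng.random()
    return (m_low**e + u * (m_up**e - m_low**e)) ** (1.0 / e)

def sample_total_masses(n_stars, n_clusters, alpha=2.35, m_low=0.1, m_up=100.0, seed=1):
    """Total mass of n_clusters simulated clusters, each with exactly n_stars stars."""
    rng = random.Random(seed)
    return [sum(sample_power_law(rng, alpha, m_low, m_up) for _ in range(n_stars))
            for _ in range(n_clusters)]

def mean_mass(alpha=2.35, m_low=0.1, m_up=100.0):
    """Analytic mean stellar mass <m> of the toy IMF (alpha != 1 and alpha != 2 assumed)."""
    e0, e1 = 1.0 - alpha, 2.0 - alpha
    norm = (m_up**e0 - m_low**e0) / e0
    return (m_up**e1 - m_low**e1) / e1 / norm

if __name__ == "__main__":
    masses = sample_total_masses(n_stars=10, n_clusters=20000)
    print("N x <m>      :", 10 * mean_mass())
    print("sample mean  :", statistics.mean(masses))
    print("sample median:", statistics.median(masses))
```

For small $N$ the median of the simulated totals falls below their mean, which is the asymmetry referred to in the text; setting `n_stars=1` simply returns draws from the toy IMF itself.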

Scenario 3 is a constrained version of the previous one. In the universe of all possible clusters with $N$ stars, only those conditioned to have $N_a$ stars with mass equal to or larger than $m_a$ can represent the cluster studied. The resulting distribution of possible $M$, which is different from the previous sampling distribution, can be obtained by imposing an a posteriori condition on it. However, since we also know the mass of the $N_a$ stars, an additional constraint must be applied.

Scenario 4 only constrains $M$ to be equal to or larger than the contribution of the $N_a$ stars. We cannot progress further unless we additionally assume a distribution of possible $N$ values. If we do so, the resulting values of the mean total mass $\langle M \rangle$ and the most probable value will differ from those obtained under scenario 2, since in the present case $N$ is not fixed but distributed, and this affects the shape of the sampling distribution of $M$.

In scenario 5 we know that the mass is $M$ and that there are $N_a$ stars with $m > m_a$. The probability distributions that describe such a cluster (such as, for example, the distribution of possible $N$ values or of the $N_a$ most massive stars that the cluster could host) correspond to the particular situation described in scenario 4 with the additional constraint of knowing $M$.

From the above discussion, it is clear that the $M$ values derived in each of the above scenarios are different. Although all the resulting distributions are derived from the IMF, each of them is the result of including different pieces of information in the analysis: either the total number of stars $N$ in the cluster (in scenarios 2 and 3), its probability distribution (in scenarios 4 and 5), or the presence of $N_a$ stars above a given mass value (in scenarios 3, 4, and 5). Each case results in a different conditional probability distribution, which results in a different estimation of $M$.



We note that relating the IMF with the corresponding sam- 
pling (and conditional) probability distributions is correct, given 
the set under study. We also note that we have an additional piece 
of information in such a set: stars are individual, discrete entities 
(i.e., N is a natural number). Such a condition must be fulfilled 
by any cluster in the Universe and must be included in all scenarios as a restriction (even in cases where there is no explicit reference to $N$, as in scenario 5).

The preceding discussion boils down to the following point: 
as an underlying density distribution, the IMF describes neither 
a particular case nor any observational constraints (such as, e.g., 
the number of stars with a given mass observed in a particular 
cluster). Once an observational constraint is included (e.g., the 
fact that one star with known mass is present), conditional prob- 
abilities must be applied. Stated otherwise, the distribution that 
describes the universe of possible results (the IMF) is an a priori 
probability, and the probability constrained to the observed data 
is an a posteriori (conditional) probability. Confusing the a pos- 
teriori probability with the a priori probability is one of the most 
common flaws in hypothesis testing reasoning (this is also called the Prosecutor's fallacy; see Selman & Melnick 2008 for a discussion in a similar astrophysical context). In these situations, it is fundamental to understand the true context of the question before seeking an answer. This has been done in the five scenarios discussed above.

The structure of the paper is as follows: In Sect. 2 we summarize the basic concepts required to use the IMF in a probabilistic framework (see Paper I for a more extended discussion). In Sect. 3 we consider an ideal case in which all the stars in the system are known. Then we replace known information by unknowns to describe real situations where the use of the IMF or a related sampling distribution is required. Section 4 shows the methodology to obtain $M$ from partial information of the system in the scenarios presented above and their application to some astrophysical cases. We discuss some considerations about the use of prior information in Sect. 5. Our conclusions are described in Sect. 6.



2. Formal probabilistic formulation 

The basis of the probabilistic formulation has been presented in 
Paper I. We refer to that paper for more details and include here 
only the basic formulae needed for this work. 

1. The IMF, $\phi(m) = dN/dm$, is a probability density function (pdf), which can be integrated over a given mass range to derive the probability of finding a star in that range. The mass limits $m_{\rm low}$ and $m_{\rm up}$ are given by stellar theory and must fulfill $\int_{m_{\rm low}}^{m_{\rm up}} \phi(m)\,dm = 1$; that is, we are certain that any possible star has a mass between $m_{\rm low}$ and $m_{\rm up}$. The probability of a random star having a mass lower than a given value is given by

$$p(m < m_a) = \int_{m_{\rm low}}^{m_a} \phi(m)\,dm. \qquad (1)$$

In this work, the integrals over the IMF will always be read as equal to or larger than the lower limit and lower than the upper limit.

In this work we employ the Kroupa IMF (Kroupa 2001, 2002) as used in Weidner & Kroupa (2006), with $m_{\rm up} = 120\,M_\odot$, $m_{\rm low} = 0.01\,M_\odot$, and a correction of $k' = 1/3$ for stars with mass lower than $0.08\,M_\odot$.^2

2. Different observational scenarios can be described by adding constraints to the IMF. For instance, we may explicitly include a limit on $m_a$ and compute probabilities for stars with masses lower than $m_a$. In this case, we must define an a posteriori pdf related to the IMF that includes such a condition:

$$\phi(m | m < m_a) = \frac{\phi(m)\,H(m_a - m)}{p(m < m_a)}, \qquad (2)$$

where $H(m_a - m)$ is the Heaviside function^3, which ensures that no star equal to or larger than $m_a$ can be present in the cluster. We note that $\phi(m | m < m_a)$ is a pdf also. The mean mass of such distribution is

$$\langle m | m < m_a \rangle = \frac{\int_{m_{\rm low}}^{m_{\rm up}} m\,\phi(m)\,H(m_a - m)\,dm}{p(m < m_a)}. \qquad (3)$$



3. The pdf describing ensembles with a total number of stars $N$ (formally, a sampling distribution conditioned to have $N$ stars) can be calculated as successive convolutions of the corresponding pdf for one star. For instance, the pdf for the total mass, $\Phi_M(M|N)$, is the result of convolving the IMF $N$ times in a recursive convolution (see Cerviño & Luridiana 2006; Selman & Melnick 2008):

$$\Phi_M(M|N) = \overbrace{\phi(m) \otimes \phi(m) \otimes \cdots \otimes \phi(m)}^{N}. \qquad (4)$$

The same procedure applies to any other pdf. The mean value of the resulting distribution is

$$\langle M | N \rangle = N \times \langle m \rangle = N \times \int_{m_{\rm low}}^{m_{\rm up}} m\,\phi(m)\,dm. \qquad (5)$$

Mean values of constrained distributions when sampled with $N$ stars are obtained in a similar way.
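The quantities in Eqs. (1), (3), and (5) reduce to simple integrals when the IMF is a piecewise power law. The following is a minimal sketch under loud assumptions: the three segments and their slopes (0.3, 1.3, 2.3, joined continuously) are hypothetical Kroupa-like values and do not include the $k'$ correction used in this work, so the numbers it returns need not match those quoted in the paper. All function names are illustrative.

```python
import math

# Hypothetical Kroupa-like segments (m_start, m_end, alpha); ASSUMED slopes, continuous at the breaks.
SEGMENTS = [(0.01, 0.08, 0.3), (0.08, 0.5, 1.3), (0.5, 120.0, 2.3)]

def _coefficients(segments):
    """Relative amplitude of each segment so that phi(m) is continuous at the breaks."""
    coeffs = [1.0]
    for (_, m_break, a_prev), (_, _, a_next) in zip(segments, segments[1:]):
        coeffs.append(coeffs[-1] * m_break ** (a_next - a_prev))
    return coeffs

def moment(k, lo, hi, segments=SEGMENTS):
    """Unnormalised integral of m**k * phi(m) over [lo, hi)."""
    total = 0.0
    for c, (a, b, alpha) in zip(_coefficients(segments), segments):
        x0, x1 = max(a, lo), min(b, hi)
        if x1 <= x0:
            continue                      # segment does not overlap the requested range
        e = k + 1.0 - alpha
        total += c * (math.log(x1 / x0) if e == 0 else (x1**e - x0**e) / e)
    return total

M_LOW, M_UP = SEGMENTS[0][0], SEGMENTS[-1][1]
NORM = moment(0, M_LOW, M_UP)             # makes phi(m) integrate to one

def prob_below(m_a):
    """p(m < m_a), Eq. (1)."""
    return moment(0, M_LOW, m_a) / NORM

def mean_below(m_a):
    """<m | m < m_a>, Eq. (3)."""
    return moment(1, M_LOW, m_a) / moment(0, M_LOW, m_a)

def mean_total_mass(n_stars):
    """<M | N> = N x <m>, Eq. (5)."""
    return n_stars * moment(1, M_LOW, M_UP) / NORM

if __name__ == "__main__":
    print(prob_below(0.5), mean_below(0.5), mean_total_mass(100))
```

A useful internal check is that the mean mass splits as $\langle m \rangle = p(m < m_a)\,\langle m | m < m_a \rangle + [1 - p(m < m_a)]\,\langle m | m \geq m_a \rangle$ for any $m_a$.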



of known data from a particular cluster (i.e., a particular IMF 
realization) and the use of probability distributions. 

We sort the stars in ascending order according to their mass. We use a subindex in brackets to denote that such operation has been performed, so $m_i$ is the $i$-th random sampled element and $m_{[i]}$ is the $i$-th element after sorting the data. We also assume that the most massive star has a mass $m^{\rm obs}_{[N]}$ with a value lower than $m_{\rm up}$.

In addition, we assume that we have $N_a$ stars equal to or more massive than an arbitrary value $m_a$, so that $m_{[N-N_a]} < m_a$ and $m_{[N-N_a+1]} \geq m_a$. We express the total number of stars and total mass as a function of the $N_a$ set. It can be described as



[Eq. (6): garbled in the source; the surviving terms are sums involving $m^{\rm obs}$ and $\delta_{ij}$] (6)



where $\delta_{ij}$ is the Kronecker delta. The total mass in the ensemble is

$$M = M_a + \sum_{i=1}^{N-N_a} m^{\rm obs}_{[i]}. \qquad (7)$$



These two equations, rewritten in terms of frequencies and mean stellar mass in the complete sample, are, respectively

[Eq. (8): garbled in the source] (8)

and

[Eq. (9): garbled in the source; its left-hand side is $\langle \hat{m} \rangle$] (9)

Multiplying $\langle \hat{m} \rangle$ by $N$ produces the value of $M$. However, we note that conceptually



3. The trade-off between knowledge and probability 

Once we have laid down the basic framework, we apply it to our science case: the estimation of the total mass $M$ of a cluster from a partial knowledge of its stellar content. To do that, we progressively replace known information by unknowns to describe real situations; however, the following items are not directly related to the scenarios quoted in the Introduction (we will come back to such scenarios in Sect. 4).

3.1. Case study 1: Everything is known 

We begin with an ideal observational point of view, where we suppose that we know the masses $m^{\rm obs}_i$ of every one of the $N$ stars in a cluster. Thus, the total mass, $M$, is also known. In this hypothetical case, it is not required to use the IMF. However, this exercise allows us to illustrate the trade-off between the use

^2 Such a correction was not used in Paper I. However, it is the parametrization used in the set of clusters by Kirk & Myers (2011) that we use in this work for comparing methodologies.

^3 We use here the Heaviside function as a distribution to define the domain of $\phi(m)$ including constraints. In this situation the value of $H(0)$ is not defined, but it is assigned a posteriori to be consistent with the convention used in the integral limits. In the case of Eq. 2, $H(0) = 0$.



$$M = N \times \langle \hat{m} \rangle \neq N \times \langle m \rangle = \langle M \rangle, \qquad (10)$$

since $\langle \hat{m} \rangle$ (the sample mean) does not coincide with the mean stellar mass obtained from the IMF, $\langle m \rangle$ (the population mean). That is, $\langle \hat{m} \rangle$ is an estimate of $\langle m \rangle$ obtained from a sample of $N$ stars, so, formally, $\langle \hat{m} \rangle = \langle \hat{m} | N \rangle$. In the following, we use the $\hat{m}$ symbol to denote an estimate of $m$. In the computation of this estimate, the value of $N$ must be taken into consideration, although we will not write it explicitly in order to simplify the notation.

3.2. Case study 2: The total number of stars and the mass of the most massive stars are known

In this case we have less information than in the previous case, since we only know the stellar masses $m^{\rm obs}_{[i]}$, with $i = \{N - N_a + 1, \ldots, N\}$, and $N$. But we have seen that estimates obtained from actual values, such as $\langle \hat{m} \rangle$, can be related to values obtained from the IMF. So we can replace these estimates with

[equation garbled in the source]
Thus, although we cannot know the actual $M$ value, we can at least obtain average values given different sets of constraints:

$$\langle M \,|\, m^{\rm obs}_{[i]} \geq m_a,\; i = N - N_a + 1, \ldots, N;\; N \rangle = M_a + (N - N_a)\,\langle m | m < m_a \rangle. \qquad (11)$$



This illustrates the trade-off between observed frequency dis- 
tributions and probability: when we use a probability distribu- 
tion, we cannot have access to the actual values, but we can have 
access to the distribution of possible values and the mean value 
of all these possible values. In this case we are using the estimates argument in the opposite direction to a statistical analysis, i.e., we are making the assumption that all the stars are distributed following the IMF^4 and using it to make inferences about related quantities.

3.3. Case study 3: Only the mass of the more massive stars is known

Observations of clusters in many cases only allow characterization of the $N_a$ most luminous stars with masses $m^{\rm obs}_{[i]}$, $i = \{N - N_a + 1, \ldots, N\}$. They also lack a proper census that includes the least luminous members (see Bayo et al. 2011; Kirk & Myers 2011, as counterexamples). In this case, it is more difficult to obtain estimates, since we cannot define a frequency for $N_a$. Therefore, is the following reasoning valid?



$$\hat{p}(m | m < m_a) = \frac{N - N_a}{N} \approx p(m | m < m_a), \qquad \hat{p}(m | m > m_a) = \frac{N_a}{N} \approx p(m | m > m_a).$$



3.3.1. When is the correspondence $N = N_a / p(m | m > m_a)$ valid?



We divide the IMF into, e.g., $k + 1$ mass intervals, where the mass interval containing the lower masses comprises the $N - N_a$ unknown stars with mass lower than $m_a$. Each of the remaining $k$ mass intervals, $[m_i^{\rm low}, m_i^{\rm up})$, contains $n_i$ stars^5, so that $\sum_{i=1}^{k} n_i = N_a$. The probability of having a star in a given mass interval is given by the integration of the IMF over such a mass interval, $p_i(m) = p(m \in [m_i^{\rm low}, m_i^{\rm up}))$. We assume that the cluster is a random realization of the IMF for $N$ stars, so the probability of having the $N$ stars distributed in the $k + 1$ intervals with $n_i$ stars in the $i$-th interval for a given (unknown) number of stars $N$ is given by the multinomial distribution^6



$$\Phi_{N_a}(N_a | N) = P\Big(m > m_a, \sum_{i=1}^{k} n_i = N_a \,\Big|\, N\Big) = \frac{N!}{(N - N_a)!\,\prod_{i=1}^{k} n_i!}\; p(m < m_a)^{N - N_a} \prod_{i=1}^{k} p_i(m)^{n_i} = A(p_i, n_i)\,\frac{N!}{(N - N_a)!}\; p(m < m_a)^{N - N_a}, \qquad (12)$$

^4 Hence, it includes the $N_a$ subset with known stellar masses.

^5 In this case, we are distributing the known $N_a$ stars in $k$ intervals and not using the particular $m$ values of known stars. Such intervals can be arbitrary and must only obey the condition $\sum_{i=1}^{k} n_i = N_a$. So the index $i$ here refers to the interval, not to a particular stellar mass.

^6 Since $N_a$ is a discrete quantity, its pdf directly provides the probability. In addition, the distribution can also be expressed as a binomial distribution with $A(p_i, n_i)\,N_a! = p(m > m_a)^{N_a} = (1 - p(m < m_a))^{N_a}$.



where we have included in $A(p_i, n_i)$ all the known information. However, we are interested in the complementary distribution $\Phi_N(N | N_a)$, which must be obtained using Bayes' theorem (see, e.g., Paper I). Assuming that the possible values of $N$, $\Phi_N(N)$, follow a discrete power-law probability distribution with exponent $-\beta$, we obtain

$$\Phi_N(N | N_a) = A'\,N^{-\beta}\,\frac{N!}{(N - N_a)!}\; p(m < m_a)^{N - N_a}, \qquad (13)$$



where $A'$ is a normalization value that includes all the known terms and is independent of $A(p_i, n_i)$, since $A(p_i, n_i)$ is canceled out by the normalization constant. Thus, the inference about the total number of stars only depends on the number of stars $N_a$ more massive than a certain observational value $m_a$, and not on the particular distribution of such stars in different mass bins.
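The cancellation of $A(p_i, n_i)$ can be checked numerically. The sketch below is an illustration only: it evaluates the multinomial likelihood of Eq. (12) on a grid of $N$ values for two different ways of distributing the same $N_a$ stars among the bins, multiplies by a power-law prior, and normalizes. The bin probabilities, the counts, and the upper cut `n_max` of the grid are hypothetical choices.

```python
import math

def log_multinomial(n_total, counts, probs, p_below):
    """Log of Eq. (12): N stars, counts[i] of them in bin i above m_a, the rest below m_a."""
    n_a = sum(counts)
    out = math.lgamma(n_total + 1) - math.lgamma(n_total - n_a + 1)
    out += (n_total - n_a) * math.log(p_below)
    for n_i, p_i in zip(counts, probs):
        out += n_i * math.log(p_i) - math.lgamma(n_i + 1)
    return out

def posterior_n(counts, probs, p_below, beta=0.0, n_max=2000):
    """Normalised Phi_N(N | data) on N = N_a..n_max for a prior proportional to N**(-beta)."""
    n_a = sum(counts)
    grid = range(max(n_a, 1), n_max + 1)
    logs = [log_multinomial(n, counts, probs, p_below) - beta * math.log(n) for n in grid]
    top = max(logs)                       # subtract the maximum to avoid underflow
    weights = [math.exp(v - top) for v in logs]
    norm = sum(weights)
    return [w / norm for w in weights]

if __name__ == "__main__":
    probs = [0.12, 0.05, 0.02]            # assumed bin probabilities above m_a (sum = 0.19)
    post_a = posterior_n([6, 2, 0], probs, p_below=0.81, beta=1.0)
    post_b = posterior_n([1, 2, 5], probs, p_below=0.81, beta=1.0)
    print(max(abs(a - b) for a, b in zip(post_a, post_b)))
```

Both configurations have $N_a = 8$, and the two normalized distributions agree to rounding error even though the second configuration is a far less probable realization of the assumed bins.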

This result might seem surprising: the knowledge of the masses of particular stars does not provide additional information on (the number of) unobserved ones.^7 It can be argued that, for example, an excess or deficit of the observed number of stars in a given mass range constrains the total number of stars from being compatible with sampling effects. However, such arguments are valid for IMF inferences (which IMF shape is more probable, given some observations?), i.e., the problem of obtaining the IMF.

In our case, a given IMF is assumed and the observations are 
a random realization of it. The particular observed set may be a 
highly improbable (but still possible) realization of the assumed 
IMF. Nevertheless, whatever its a priori probability of happen- 
ing, it has actually happened, and thus a posteriori probabilities 
must be obtained by taking this fact into consideration. In ad- 
dition, since stellar masses are random variables (cf. Paper I), 
the occurrence of having a star (or a set of stars) with a given 
particular mass has no impact on the individual masses of the 
remaining stars. 

The mode of $\Phi_N(N | N_a)$, $N^{\rm mode}$, is obtained by equating to zero its first derivative with respect to $N$, which, for large $N$ values^8, yields

$$N^{\rm mode} \approx \frac{\ldots}{\ln p(m < m_a)}, \qquad (14)$$



where we used the Stirling approximation of factorial functions and a first-order Taylor approximation of logarithm functions valid for $\beta \neq 0$. In the case of a flat distribution with $\beta = 0$, the approximate mode of the distribution is obtained by solving

$$p(m < m_a) = \left(1 - \frac{N_a}{N}\right), \qquad (15)$$

which coincides with the estimation of the probability $p(m < m_a)$ for known $N_a$ and $N$. This means that $N_a / p(m | m > m_a)$ provides the mode $N^{\rm mode}$ of $\Phi_N(N | N_a)$ assuming a flat $\Phi_N(N)$ distribution. However, we know that the initial cluster mass function (ICMF, $\Phi_M(M)$) is not flat (Lada & Lada 2003;



^7 However, we note that such information is still relevant for the computation of $M$: the individual masses of stars more massive than $m_a$ provide the amount of mass in the mass range, $M_a$.

^8 In practical terms it implies large values. Actually, $\Phi_N(N | N_a)$ is a discrete distribution, hence not differentiable, but the formulae provide a reasonable value as far as the Stirling approximation of factorial functions is valid, i.e., $N$, $N_a$, and $N - N_a$ larger than 15.
name      M       M_a     N     N_a   <m_a>   log P_nor(N_a|N)
Tau.
#1        10.6    7.6     20    8     0.95    -1.06
#2        15.5    11.5    30    12    0.96    -1.55
#3        8.1     5.9     19    8     0.74    -1.20
#4        …       …       …     …     …       …
#5        …       …       14    10    0.80    …
#6        …       …       31    14    1.05    …
#7        …       …       24    13    …       …
#8        …       …       …     …     …       …
(field)   …       …       …     …     …       …
Cha I
#1        3.7     1.7     12    2     0.85    0.00
#2        …       …       …     …     …       …
#3        …       …       …     …     1.03    …
(field)   …       …       …     …     1.11    …
Lup. 3
#1        …       …       36    …     1.22    …
(field)   18.1    12.9    34    11    1.17    -0.76
IC348
#1        111.9   87.6    186   65    1.35    -5.43
#2        3.1     0.5     11    1     0.53    -0.08
(field)   78.2    51.7    166   35    1.48    -0.08

Table 1. Data from stellar associations by Kirk & Myers (2011). We show the total mass ($M$), the mass in stars more massive than $m_a$ ($M_a$), the total number of stars ($N$), the number of stars more massive than $m_a$ ($N_a$), the estimation of the mean mass for stars more massive than $m_a$ ($\langle \hat{m}_a \rangle = \langle \hat{m} | m > m_a \rangle$), and the logarithm of the probability that a cluster with $N$ stars following the assumed IMF would have $N_a$ stars with mass equal to or larger than $m_a$, divided by the maximum of such distribution ($\log P_{\rm nor}(N_a | N)$). The $m_a$ value is set to $0.5\,M_\odot$.



Piskunov et al. 2008) and that it must be somehow related to $\Phi_N(N)$ (cf. Eq. …), although we are not able to establish its functional form. Whatever equation we use to obtain $N^{\rm mode}$, we are left in the uncomfortable situation of mixing a mean value ($\langle m | m < m_a \rangle$) with a mode value $N^{\rm mode}$ to obtain an inference about $M$. However, we have no means to give a meaning to this inference: Is it a mean, a mode, or any other parameter?
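The difference between summaries of $\Phi_N(N | N_a)$ can be made concrete by evaluating the distribution on a grid. The following sketch assumes the form $N^{-\beta}\,N!/(N - N_a)!\;p(m < m_a)^{N - N_a}$ for the unnormalized posterior, an arbitrary upper cut `n_max` on $N$, and illustrative values $N_a = 10$ and $p(m < m_a) = 0.81$; none of these choices is taken from the analysis in this work.

```python
import math

def log_posterior(n, n_a, p_below, beta):
    """Unnormalised log of N**(-beta) * N!/(N-N_a)! * p(m<m_a)**(N-N_a)."""
    return (-beta * math.log(n) + math.lgamma(n + 1) - math.lgamma(n - n_a + 1)
            + (n - n_a) * math.log(p_below))

def posterior_summary(n_a, p_below, beta, n_max=5000):
    """Return (mode, mean) of the discrete posterior on N = N_a..n_max (n_a >= 1 assumed)."""
    grid = list(range(n_a, n_max + 1))
    logs = [log_posterior(n, n_a, p_below, beta) for n in grid]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    norm = sum(weights)
    mode = grid[logs.index(top)]
    mean = sum(n * w for n, w in zip(grid, weights)) / norm
    return mode, mean

if __name__ == "__main__":
    n_a, p_above = 10, 0.19
    for beta in (0.0, 2.0):
        mode, mean = posterior_summary(n_a, 1.0 - p_above, beta)
        print(f"beta={beta}: mode={mode}, mean={mean:.1f}, N_a/p={n_a / p_above:.1f}")
```

With a flat prior the discrete mode sits at the integer part of $N_a / p(m | m > m_a)$, while the mean is larger; a steeper prior moves both to lower $N$. This is why a single number for $N$ is ambiguous unless the summary being used is stated.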

This suggests that it is better to use the resulting probability distribution $\Phi_N(N | N_a)$ and obtain the corresponding $\Phi_M[M | N_a, \Phi_N(N)]$ to make inferences about $M$. In addition, this way of proceeding is in agreement with the International Organization for Standardization (ISO), which recommends expressing the uncertainty in the results as a pdf.^9

4. Use cases 

Having presented the probabilistic framework and the related information trade-off, we can compare the probabilistic methodology and the distribution-by-number methodology to obtain $M$ and $N$. For comparison purposes, we have used the data from Kirk & Myers (2011) to illustrate the differences. The data contain the observed masses for individual stars belonging to 14 young stellar groups in four different regions. They also contain the stellar mass of field stars in the four analyzed regions. Table 1 shows the identifier of the cluster along with the values of $M$, $M_a$, $N$, $N_a$, and the estimation of the mean mass, $\langle \hat{m}_a \rangle = \langle \hat{m} | m > m_a \rangle$, from the census of stars with $m > m_a$.

^9 Guide to the Expression of Uncertainty in Measurement (International Organization for Standardization, Switzerland, 1995), http://www.bipm.org/en/publications/guides/gum.html



Kirk & Myers (2011) state that their mass estimates are valid with a relative error of 50%; in this work we assume that the tabulated values can be taken at face value without errors. They also state that their census is complete at a 90% level down to $0.08\,M_\odot$; hence their total mass estimation would actually be a lower limit of the real value. Whatever the case, we assume again that the $M$ values obtained from the data can be used at face value without errors. Finally, we assume that the data are 100% complete down to $m_a = 0.5\,M_\odot$. We use this $m_a$ value to illustrate the $M$ inference in scenarios 2, 3, and 4 in the Introduction.

As reference, the IMF used here produces $\langle m \rangle = 0.46\,M_\odot$, $\langle m | m > m_a \rangle = 1.64\,M_\odot$, and $p(m | m > m_a) = 0.19$. We can make a first-order test about the compatibility of the cluster data with the assumed IMF by computing the probability of having a given $N_a$ number of stars with mass larger than $m_a$ in a cluster with $N$ stars. It can be done by dividing the IMF into two bins, $[m_{\rm low}, m_a)$ and $[m_a, m_{\rm up})$, and using the probability in each bin to define a binomial distribution. The logarithm value of the resulting probabilities normalized to the maximum value of the distribution, $\log P_{\rm nor}(N_a | N)$, are shown in column 7 of Table 1.^10 In this test we see that our hypothesis about the validity of the used IMF in all the associations is actually questionable for the stars in the Taurus field, Taurus #4, and IC348 #1, and would produce some problems in the analysis of Taurus #5, #7, and #6.
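The two-bin test can be sketched in a few lines. The code below is an illustration, not the exact computation behind Table 1: it uses the rounded value $p(m | m > m_a) = 0.19$ quoted above, and the function names are hypothetical. Because of the rounding, its output should only be expected to approximate the tabulated values.

```python
import math

def log10_binomial(k, n, p):
    """log10 of the binomial probability of k successes in n trials with success probability p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p)) / math.log(10.0)

def log_p_nor(n_a, n_total, p_above):
    """log10 of P(N_a | N) divided by the maximum of the binomial distribution."""
    peak = max(log10_binomial(k, n_total, p_above) for k in range(n_total + 1))
    return log10_binomial(n_a, n_total, p_above) - peak

if __name__ == "__main__":
    # (N_a, N) pairs taken from the legible rows of Table 1
    for name, n_a, n in [("Tau. #1", 8, 20), ("Cha I #1", 2, 12), ("IC348 #1", 65, 186)]:
        print(name, round(log_p_nor(n_a, n, 0.19), 2))
```

By construction the statistic is zero when $N_a$ equals the most probable count for the given $N$ and becomes more negative as the observed count moves into the tails, which is how the strongly discrepant systems are flagged.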

4.1. Distribution-by-number methodology

The distribution-by-number methodology considers that the IMF can be used with an arbitrary normalization. Such normalization can be either to $N$ or $M$, which implies multiplying $\phi(m)$ by $N$ or $M / \langle m \rangle$, respectively. In addition, it is assumed that $N$ and $M$ are deterministically related by the relation

$$M = N \times \langle m \rangle. \qquad (16)$$

This provides $M$ in all the cases where $N$ is given and vice versa. We can include additional information like $M_a$ and $N_a$ to make alternative inferences about $M$. Following the procedure of this paper, the most information is included using a formula similar to Eq. 11:

$$M = M_a + (N - N_a)\,\langle m | m < m_a \rangle. \qquad (17)$$

However, we can choose to use only partial information, such as the contribution of $M_a$ to the total budget. Then the ratio $M / M_a$ is constant, and is equal to the ratio of $m \times \phi(m)$ integrated in the whole range, $\langle m \rangle$, over the same function integrated in the $m_a$, $m_{\rm up}$ range. As a result, $M$ is:

$$M = \frac{M_a \times \langle m \rangle}{\int_{m_a}^{m_{\rm up}} m\,\phi(m)\,dm}. \qquad (18)$$

On the other hand, we could choose to use the contribution of $N_a$ to the total budget. Then the ratio $N / N_a$ is constant and is equal to the ratio of $\phi(m)$ integrated in the whole range (that is, the unity) over the $\phi(m)$ integrated in the $m_a$, $m_{\rm up}$ range. Since $M = N \times \langle m \rangle$, $M$ is

$$M = \frac{N_a \times \langle m \rangle}{\int_{m_a}^{m_{\rm up}} \phi(m)\,dm}. \qquad (19)$$

^10 We note that a comparison of $\hat{p}(m | m > m_a)$ and $p(m | m > m_a)$ does not produce a valid test about IMF compatibility, since the importance of the possible deviations depends on how many stars are in the sample (size of sample effects). Interestingly, IC348 #1, which deviates from the IMF in this test, is the system used as an example by Kirk & Myers (2011) to argue that their systems follow a Kroupa IMF (their Fig. 6). Although the shape of the IMF realization in IC348 #1 would look like a Kroupa IMF, the deviations (fluctuations) observed are actually too large compared with the expected ones taking into account the number of stars in the system.



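We could also choose to use just $M_a$ and $N_a$ values without the information about $N$ (similar to Eq. 17 with some additional algebraic manipulation):

$$M = M_a + N_a\,\langle m | m < m_a \rangle\,\frac{\int_{m_{\rm low}}^{m_a} \phi(m)\,dm}{\int_{m_a}^{m_{\rm up}} \phi(m)\,dm}. \qquad (20)$$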



Equations 18, 19, and 20 produce an equal value of $M$ as far as

$$\frac{M_a}{N_a} = \langle \hat{m} | m > m_a \rangle = \langle m | m > m_a \rangle,$$

and they will produce a result similar to Eqs. 16 and 17 as far as, additionally,

$$\hat{p}(m | m > m_a) = \frac{N_a}{N} = p(m | m > m_a).$$



In relation to the scenarios presented in the Introduction, scenario 2 (only $N$ is observed) is described by Eq. 16. Scenario 3 ($N$, $N_a$, and $M_a$ are known) can be described by Eqs. 16, 17, 18, 19, and 20, depending on the information we choose to use, with Eq. 17 being the one that uses the most available information. Finally, scenario 4 can be described by Eqs. 18, 19, and 20, with Eq. 20 being the one that uses the most available information.
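For reference, Eqs. 16 to 20 can be written as five small functions. This is a sketch that uses the rounded reference values quoted in the text ($\langle m \rangle = 0.46$, $\langle m | m > m_a \rangle = 1.64$, $p(m | m > m_a) = 0.19$); the value of $\langle m | m < m_a \rangle$ is not quoted and is derived here from the decomposition of the mean, which is an assumption of this example. Function and variable names are illustrative.

```python
MEAN_M = 0.46          # <m>, solar masses (value quoted in the text)
MEAN_ABOVE = 1.64      # <m | m > m_a>, solar masses
P_ABOVE = 0.19         # p(m > m_a)
P_BELOW = 1.0 - P_ABOVE
# Assumption: <m | m < m_a> follows from <m> = p_below*<m|below> + p_above*<m|above>.
MEAN_BELOW = (MEAN_M - P_ABOVE * MEAN_ABOVE) / P_BELOW

def eq16(n):
    """M from the total number of stars only."""
    return n * MEAN_M

def eq17(mass_above, n, n_a):
    """M from M_a, N, and N_a."""
    return mass_above + (n - n_a) * MEAN_BELOW

def eq18(mass_above):
    """M from the mass fraction above m_a."""
    return mass_above * MEAN_M / (P_ABOVE * MEAN_ABOVE)

def eq19(n_a):
    """M from the number fraction above m_a."""
    return n_a * MEAN_M / P_ABOVE

def eq20(mass_above, n_a):
    """M from M_a and N_a, without N."""
    return mass_above + n_a * MEAN_BELOW * P_BELOW / P_ABOVE

if __name__ == "__main__":
    # Tau. #1 from Table 1: M_a = 7.6, N = 20, N_a = 8
    print(eq16(20), eq17(7.6, 20, 8), eq18(7.6), eq19(8), eq20(7.6, 8))
```

The three estimators that do not use $N$ coincide only when $M_a / N_a$ equals $\langle m | m > m_a \rangle$; for a system whose observed mean mass above $m_a$ is lower than the IMF value, Eq. 18 gives the smallest estimate, Eq. 19 the largest, and Eq. 20 lies in between.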

The resulting $M$ estimations from Kirk & Myers (2011) data employing this methodology are shown in Table 2, which uses different information from the cluster. The inferred $M$ varies depending on the formulae (and hence the amount of non-redundant information) used for the inference. The best result is obtained by Eq. 17, but unfortunately it does not have a practical application ($N$ is unknown most of the time).

With respect to the equations that can be used in scenario 4 (the common observational case), Eq. 20 produces a value between the results of Eqs. 18 and 19. Also, since $\langle \hat{m} | m > m_a \rangle$ underestimates $\langle m | m > m_a \rangle$ for the clusters in the given sample, Eq. 18 produces lower values than Eq. 19 (see Taurus #8 as the opposite example). The range of inferred $M$ values covered by Eqs. 18, 19, and 20 only includes the observed $M$ value in four cases (Taurus #8, Cha #1 and #2, and the field stars in IC348), suggesting a 20% rate of success (33% if we exclude the five clusters with possible strong deviations from the assumed IMF). In addition, we do not know which equation produces the more reasonable value (although Eq. 20 is preferred), nor do we have a possible evaluation of the accuracy associated with each case.

4.1.1. The probabilistic methodology

In the probabilistic case, pdfs are only used to describe unknown 
data, and observed data is used to define constraints over such 
unknown data, so that both types of data have different roles. 
In addition, the solution cannot be summarized in a single value, 
but as a distribution function. Although some summaries of such 
distribution (as the mean value) can be obtained analytically. 
Table 2. Inference of $M$ employing the distribution-by-number
methodology for the stellar associations of Kirk & Myers (2011)
according to different scenarios.



such values do not necessarily carry enough information, and
the best approach is to obtain the full distribution of possible
solutions and work with it. We propose here a methodology to
obtain the probability distribution of $M$ when we know the
individual masses of the $N_a$ most massive stars and we know that all
stars with masses equal to or larger than $m_a$ are included in the $N_a$
set. The problem cannot be solved analytically, since recursive
convolutions involving power laws (such as the IMF) have no
analytical solution, so we can only propose the following
step-by-step procedure:

1. Obtain the distribution of $N$, $\Phi_N(N \mid N_a)$, which can be
inferred from the data using Eq. [13]. We stress again that an
assumption about $\Phi_N(N)$ is required. We note that the result
depends strongly on the lower limit assumed for the
$\Phi_N(N)$ distribution.

2. Compute the distribution $\Phi_{M_{\mathrm{not\text{-}obs}}}(M_{\mathrm{not\text{-}obs}} \mid N_i)$
for the possible values of $N_i = N - N_a$ obtained from the
previous distribution. This distribution gives the possible
values of the total mass of the unknown stars,
$M_{\mathrm{not\text{-}obs}}$; that is, $M$ is actually constrained to the
non-observed stellar masses $m < m_a$, so we must use a
constrained IMF to describe what we do not know, $\phi(m \mid m < m_a)$.
Such $\Phi_{M_{\mathrm{not\text{-}obs}}}(M_{\mathrm{not\text{-}obs}} \mid N_i)$ distributions
can be computed either by Monte Carlo simulations or by numerical
self-convolution.

3. Compute the distribution $\Phi_M(M \mid M_a, N_a)$. This is done by
weighting the previous $\Phi_{M_{\mathrm{not\text{-}obs}}}(M_{\mathrm{not\text{-}obs}} \mid N_i)$
distributions by the probability of each $N_i$ value provided by
$\Phi_N(N \mid N_a)$ and including the contribution of the observed
stars to the total mass.

We note that these last two steps can be done by means of
Monte Carlo simulations, which sample the discrete distribution
$\Phi_N(N \mid N_a)$ to obtain different $N_i$ values, and by sampling the
Table 3. Inference of $M$ employing the probabilistic methodology
for the stellar associations of Kirk & Myers (2011) in scenario
2, using the observed value of $N$.



Table 4. Inference of $M$ employing the probabilistic methodology
for the stellar associations of Kirk & Myers (2011) in scenario
3, using the observed values of $N$, $N_a$, $M_a$, and $m_a = 0.5\,M_\odot$.



constrained IMF with this number of stars. The previous procedure
also covers scenarios 2 and 3 by applying only step 2: obtain
$\Phi_M(M \mid N)$ or $\Phi_{M_{\mathrm{not\text{-}obs}}}(M_{\mathrm{not\text{-}obs}} \mid N_i)$ for a known $N$.
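
The step-by-step procedure can be sketched as a short Monte Carlo simulation. This is only an illustrative sketch under stated assumptions, not the code used for the results reported here: the single power-law IMF (slope 2.35 between 0.1 and 100 $M_\odot$), the function names, and the default number of simulations are hypothetical choices, and the probability of $N_a$ given $N$ is taken to be binomial with $p = p(m \geq m_a)$, as follows from treating the IMF as a pdf.

```python
import math
import random

# Assumed IMF: phi(m) proportional to m**(-ALPHA) on [M_LOW, M_UP] (solar masses).
ALPHA, M_LOW, M_UP = 2.35, 0.1, 100.0


def imf_cdf(m):
    """Cumulative probability P(mass <= m) for the assumed power-law IMF."""
    e = 1.0 - ALPHA
    return (m**e - M_LOW**e) / (M_UP**e - M_LOW**e)


def sample_mass_below(m_a, rng):
    """Draw one mass from the constrained IMF phi(m | m < m_a) by inverse transform."""
    e = 1.0 - ALPHA
    u = rng.random()
    return (M_LOW**e + u * (m_a**e - M_LOW**e)) ** (1.0 / e)


def posterior_n(n_a, p, n_max, beta):
    """Step 1: Phi_N(N | N_a) on N = n_a..n_max for a prior proportional to N**(-beta)."""
    ns = list(range(n_a, n_max + 1))
    # Log of binomial(N_a; N, p) times the prior, dropping terms that do not depend on N.
    logw = [math.lgamma(n + 1) - math.lgamma(n - n_a + 1)
            + (n - n_a) * math.log1p(-p) - beta * math.log(n) for n in ns]
    top = max(logw)
    weights = [math.exp(x - top) for x in logw]
    total = sum(weights)
    return ns, [w / total for w in weights]


def sample_total_mass(m_obs, m_a, n_max=4000, beta=0.0, n_sim=2000, seed=1):
    """Steps 2 and 3: Monte Carlo draws of M = M_a + M_not-obs."""
    rng = random.Random(seed)
    n_a, mass_a = len(m_obs), sum(m_obs)
    p = 1.0 - imf_cdf(m_a)  # probability that a star has m >= m_a
    ns, probs = posterior_n(n_a, p, n_max, beta)
    draws = []
    for n in rng.choices(ns, weights=probs, k=n_sim):
        # N_i = N - N_a unseen stars, each drawn from the IMF constrained to m < m_a.
        unseen = sum(sample_mass_below(m_a, rng) for _ in range(n - n_a))
        draws.append(mass_a + unseen)
    return draws


# Hypothetical cluster: five observed stars, complete down to m_a = 0.5.
draws = sample_total_mass([2.0, 1.0, 0.8, 0.6, 0.5], m_a=0.5)
print("mean of the sampled total mass:", sum(draws) / len(draws))
```

Sampling $N$ first and the unseen masses afterwards performs the weighting of step 3 implicitly, since each simulated cluster already carries the probability of its $N_i$ value.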

We applied this methodology to the set of clusters of
Kirk & Myers (2011) under different scenarios by means of
Monte Carlo simulations. The distribution of solutions for each
cluster in each scenario was sampled by 10^ Monte Carlo
simulations, and the resulting distribution was binned in intervals
of $\Delta M = 0.5\,M_\odot$. We note that in scenario 4 the
simulations sample both the IMF and the assumed $\Phi_N(N)$ distributions
(power laws with $\beta = 0$ and $\beta = 2$). Therefore, the simulations
span a larger $M$ range, and an additional uncertainty is expected
in the confidence interval estimates.

Table [3] shows the resulting mean, mode, and 68.3% (equivalent
to $1\sigma$ in a Gaussian distribution) and 95.4% (equivalent to
$2\sigma$ in a Gaussian distribution) confidence intervals around the
mode for scenario 2. As expected, the mean value of the
distribution coincides with the result of Eq. [16] shown in Table [2].
All observed $M$ are within the 95.4% confidence interval around the
mode, although only 27% are within the 68.3% confidence interval;
in the remaining cases the observed $M$ is larger than the range
quoted for that interval.
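
How the mode and the intervals around it are extracted from the binned Monte Carlo samples can be sketched as follows. The construction is an assumption made for illustration, since the exact definition of the interval around the mode is not spelled out here: the interval starts at the modal bin and grows toward whichever neighboring bin holds more samples until it encloses the requested fraction. The function name and its defaults are hypothetical.

```python
import math


def mode_and_interval(samples, bin_width=0.5, level=0.683):
    """Return (mode, low, high) from a histogram with bins of width bin_width."""
    first = math.floor(min(samples) / bin_width)
    counts = {}
    for s in samples:
        k = math.floor(s / bin_width) - first
        counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1
    hist = [counts.get(k, 0) for k in range(n_bins)]

    i_mode = hist.index(max(hist))
    left = right = i_mode
    enclosed = hist[i_mode]
    target = level * len(samples)
    while enclosed < target:
        # Compare the two neighboring bins; -1 marks the edge of the histogram.
        next_left = hist[left - 1] if left > 0 else -1
        next_right = hist[right + 1] if right < n_bins - 1 else -1
        if next_left >= next_right:
            left -= 1
            enclosed += next_left
        else:
            right += 1
            enclosed += next_right

    mode = (first + i_mode + 0.5) * bin_width  # center of the modal bin
    return mode, (first + left) * bin_width, (first + right + 1) * bin_width


# Hypothetical, strongly skewed sample of total masses.
example = [1.1] * 6 + [1.6] * 3 + [0.6]
print(mode_and_interval(example, level=0.683))
print(mode_and_interval(example, level=0.954))
```

For an asymmetric distribution the resulting interval is not centered on the mode, which is why the mode, the mean, and the interval limits are best quoted together.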

Table [4] shows the results of the $M$ distribution for scenario
3, which includes a larger amount of information. The mean and
mode of the distribution coincide (hence the distribution is
symmetric), and the mean value also coincides with the result of
Eq. [17], as expected. However, in this case we can evaluate how
good this estimate (and hence the distribution-by-number
estimate) actually is. Rounding borderline cases favorably, 17%
of the clusters (i.e., field stars in Cha I, IC348 #1, and field stars
in IC348) are outside the $2\sigma$ range, 83% are in the $2\sigma$ range,
and 67% are in the $1\sigma$ range (i.e., 12 clusters). Given the low
number of clusters in this study, we find this result partially
consistent with a standard methodology: in theory we would
expect only one cluster outside the $2\sigma$ range, although the
difference can be attributed to the small size of the sample.
Table 5. Inference of $M$ employing the probabilistic methodology
for the stellar associations of Kirk & Myers (2011), using the
values of $N_a$, $M_a$, and $m_a = 0.5\,M_\odot$ and assuming a flat $\Phi_N(N)$
distribution.



An additional outcome of this study is that, although Eq. [17]
produces results similar to the observations, it does not necessarily
provide a fully compatible (e.g., at the $1\sigma$ level) result. Again, this
reinforces the idea of using the whole pdf of possible solutions
instead of a summary of it (like the confidence interval range).

Tables [5] and [6] show the results of applying this methodology
using flat and power-law $\Phi_N(N)$ distributions ($\beta = 0$ and $\beta = 2$,
respectively) in the range from $N = N_a$ to $N = 4000$ stars. The
Table 6. Inference of $M$ employing the probabilistic methodology
for the stellar associations of Kirk & Myers (2011), using the
values of $N_a$, $M_a$, and $m_a = 0.5\,M_\odot$ and assuming a power-law
$\Phi_N(N)$ distribution with $\beta = 2$.



first result is that the mean and mode values of the distribution are
not equal in general, and the distribution is not symmetric but
J-shaped. The mode in the case of a flat $\Phi_N(N)$ distribution is
similar to the result obtained by Eq. [20]. In this case, the observed
$M$ of seven clusters is outside the $2\sigma$ confidence interval (namely,
the clusters with the lower $p_{\mathrm{nor}}(N_a \mid N)$ values quoted before, and
Cha I #3). If we neglect the six clusters with the larger deviations
from the IMF, we find that 9% of the clusters
are outside the $2\sigma$ interval, 91% are in the $2\sigma$
interval, and 55% are in the $1\sigma$ interval. This is a
reasonable outcome for a statistical test.

Finally, the results of Eqs. [18] and [20] are within the $2\sigma$ range
in the case of a flat $\Phi_N(N)$ distribution, but the results of Eq. [19]
(estimation from the extrapolation of the observed $N_a$) are
larger than the upper limit of the $2\sigma$ range.



5. Discussion 

We have shown in this work that the determination of cluster
masses is not as trivial as is usually assumed in the literature. The
distribution-by-number methodology uses known data to
determine unknown data, whereas the probabilistic methodology
uses known data to constrain unknown data. The problem is
also related to the trade-off between unknown data and
probability. When we use a pdf, like the IMF, to make inferences
about unknown data, we implicitly renounce obtaining actual
values of the inferred quantity. The price is to renounce
precision in favor of accuracy. In contrast, the distribution-by-number
methodology favors precision and renounces accuracy. The
difference lies in the algebra (and the logical reasoning) used in each
of the methodologies to manipulate formulae. The
distribution-by-number methodology uses standard algebra, where symbols
are just mathematical expressions without added meaning. The
probabilistic methodology follows the algebra of probability,
which implies a clear identification of the known quantities and the
(random) variables we aim to describe by a probability distribution.
As an example, the equation

$$N \times p(m \geq m_a) = N_a$$

provides an estimate of the number of stars with mass equal to or
larger than $m_a$ in a cluster with $N$ stars. But such an estimate
is not necessarily a mean value nor a mode value (cf. Paper I
for the case that $N_a = 1$). In that case, we know $N$; hence, we
are working with a $\Phi_{N_a}(N_a \mid N)$ distribution. The inversion of the
equation, that is,

$$N = \frac{N_a}{p(m \geq m_a)},$$

provides the modal value $N^{\mathrm{mode}}$ of the distribution $\Phi_N(N \mid N_a)$
when a flat distribution of $N$ values is assumed (i.e., $\Phi_N(N) =$
constant). The distribution $\Phi_N(N)$ appears naturally when
Bayes' theorem is used. This is a natural result when we realize
that, since $N$ is unknown, we need its probability distribution
to make inferences about it, and that the "innocent" algebraic
manipulation we have done has a completely different meaning
than the one we would expect.
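
The role of Bayes' theorem in this inversion can be written out explicitly. The following is a sketch under one stated assumption: that the number of stars with $m \geq m_a$ in a cluster of $N$ stars drawn independently from the IMF follows a binomial distribution with $p = p(m \geq m_a)$.

```latex
\Phi_N(N \mid N_a) =
  \frac{\Phi_{N_a}(N_a \mid N)\,\Phi_N(N)}
       {\sum_{N'} \Phi_{N_a}(N_a \mid N')\,\Phi_N(N')},
\qquad
\Phi_{N_a}(N_a \mid N) = \binom{N}{N_a}\, p^{N_a}\,(1-p)^{N-N_a}.

% Flat prior, \Phi_N(N) = constant: ratio of consecutive posterior values
\frac{\Phi_N(N+1 \mid N_a)}{\Phi_N(N \mid N_a)}
  = \frac{(N+1)\,(1-p)}{N+1-N_a} \;\geq\; 1
\quad\Longleftrightarrow\quad
N + 1 \;\leq\; \frac{N_a}{p}.
```

Under this assumption the posterior increases as long as $N + 1 \leq N_a/p$ and decreases afterwards, so its mode sits at $N \simeq N_a/p$, which is the value returned by the algebraic inversion.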

5.1. To $\Phi_N(N)$ or not to $\Phi_N(N)$?

We are now in the uncomfortable situation of having to assume a
$\Phi_N(N)$ distribution in the inference of $N$ and $M$. However, the
relevance of $\Phi_N(N)$ in the inference of $M$ also depends
on the values of $m_a$ and $N_a$. In a back-of-the-envelope argument,
the effect of a power-law $\Phi_N(N)$ distribution is to decrease $N_a$ by
$\beta$ stars (cf. Eq. [14], used for the $N^{\mathrm{mode}}$ estimation). Hence, the larger
$N_a$, the lower the dependence of the $M$ estimation on $\Phi_N(N)$.
Of course, the way to increase $N_a$ is to be complete down to the
lowest $m_a$ possible.
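
The back-of-the-envelope argument can be checked numerically. The sketch below assumes a binomial probability of $N_a$ given $N$ and a prior proportional to $N^{-\beta}$. The function name, the value $p = 0.1$, and the upper limit of the grid are hypothetical choices made only for illustration.

```python
import math


def posterior_mode(n_a, p, beta, n_max=4000):
    """Mode of Phi_N(N | N_a), proportional to binomial(N_a; N, p) * N**(-beta)."""
    best_n, best_logw = None, -math.inf
    for n in range(n_a, n_max + 1):
        # Terms that do not depend on N are dropped; they do not move the mode.
        logw = (math.lgamma(n + 1) - math.lgamma(n - n_a + 1)
                + (n - n_a) * math.log1p(-p) - beta * math.log(n))
        if logw > best_logw:
            best_n, best_logw = n, logw
    return best_n


p = 0.1
for n_a in (5, 50):
    flat = posterior_mode(n_a, p, beta=0.0)
    steep = posterior_mode(n_a, p, beta=2.0)
    print(f"N_a={n_a}: flat prior mode={flat}, beta=2 mode={steep}, "
          f"(N_a - beta)/p={(n_a - 2.0) / p:.0f}, "
          f"relative shift={(flat - steep) / flat:.3f}")
```

The absolute shift of the mode is similar for both values of $N_a$, so its relative weight drops as $N_a$ grows, in line with the argument above.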

In the cases where the $M$ inference strongly depends on our
choice of $\Phi_N(N)$, we must be guided by our knowledge of the
environment of the physical system and the scientific goal of the
analysis. A flat $\Phi_N(N)$ assumes that there is no previous knowledge
about the system environment, so it looks like a good option in the
case of isolated systems and when we are only interested in the
system properties.

However, the situation changes if we are interested in a cluster
that we know is in a supercluster environment or is the result
of molecular cloud fragmentation. In these cases, depending
on our knowledge and hypotheses about star formation (SF), we
can consider that such fragmentation is the result of a high-order
structure; hence, the particular cluster is not an isolated entity.
This would imply that some values of $N$ or $M$ are more probable
than others, and this information must be taken into account
in the inference of $N$ and $M$ of the particular cluster.

We must stress here that the proposed method only applies
to $\Phi_N(N)$ distributions, and not to $\Phi_M(M)$ ones. The case of
$\Phi_N(N)$ is easily implemented, since it is related to sampling
theory and the number of elements in the sample is the relevant
quantity. The inclusion of $\Phi_M(M)$ is not so trivial, since it
depends implicitly on a $\Phi_N(N)$ distribution. However, such a
distribution cannot be obtained analytically (the convolution
problem is not analytic in general). In addition, since $\Phi_N(N)$
is a discrete distribution, we have a large but finite (and hence
computable) number of cases. This is not true for $\Phi_M(M)$,
because it is a continuous function and the number of combinations
of $N$ stars that produce a particular $M$ is infinite. At
this moment, the only solution is to use $\Phi_M(M)$ as a proxy for
$\Phi_N(N)$, which would be valid for situations where we know a
priori that the minimum possible number of stars is large (i.e.,
$N_a$ is large, or we have additional information about a minimum
number of stars in the cluster).

Finally, the situation also changes if we are interested in
obtaining $\Phi_N(N)$ or $\Phi_M(M)$ from a set of clusters. Following
Tarantola (2006), the most viable way is an iterative process.
First, assume a $\Phi_{N,0}(N)$ distribution and compute the resulting
distributions of $N_i$ and $M_i$ for each cluster. After that, combine
such distributions to obtain from the sample the global
distributions $\Phi_{N,1}(N)$ and $\Phi_{M,1}(M)$. If $\Phi_{N,1}(N) \neq \Phi_{N,0}(N)$, then
$\Phi_{N,0}(N)$ is not a self-consistent hypothesis. However, we must
be aware that this does not prove that $\Phi_{N,1}(N)$ and $\Phi_{M,1}(M)$
are self-consistent hypotheses! The only way to achieve a
self-consistent hypothesis is to iterate the process until
$\Phi_{N,j-1}(N) = \Phi_{N,j}(N)$, where the $j-1$ distribution is the one used
as input and the $j$ distribution is the resulting one, along with testing
whether the resulting $\Phi_{M,j}(M)$ distributions also obey such a condition
(a cross-validation). However, we stress again that the need for such a
cross-validation process depends on the $N_a$ value
and that, for large enough values, the resulting $\Phi_M(M \mid M_a, N_a)$
solution for the $M$ distribution of a cluster is almost independent
of $\Phi_N(N)$.
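
The iteration on $\Phi_N(N)$ can be sketched in a few lines. Several choices below are assumptions made only for illustration and are not prescribed by the text: the per-cluster distributions are combined by a plain average, the probability of $N_a$ given $N$ is binomial with the same $p$ for all clusters, and the function names, tolerance, and grid limit are hypothetical. The companion check on $\Phi_M(M)$ is not included.

```python
import math


def likelihood(n_a, n, p):
    """Binomial probability of finding n_a stars above m_a in a cluster of n stars."""
    if n < n_a:
        return 0.0
    logc = math.lgamma(n + 1) - math.lgamma(n_a + 1) - math.lgamma(n - n_a + 1)
    return math.exp(logc + n_a * math.log(p) + (n - n_a) * math.log1p(-p))


def iterate_prior(n_a_list, p, n_max=600, tol=1e-6, max_iter=200):
    """Iterate Phi_N,j-1 -> Phi_N,j until input and output distributions agree."""
    grid = range(1, n_max + 1)
    like = [[likelihood(n_a, n, p) for n in grid] for n_a in n_a_list]
    prior = [1.0 / n_max] * n_max  # Phi_N,0: flat starting hypothesis
    for j in range(1, max_iter + 1):
        new = [0.0] * n_max
        for row in like:
            # Distribution of N for this cluster, given the current hypothesis.
            post = [lk * q for lk, q in zip(row, prior)]
            norm = sum(post)
            for k, value in enumerate(post):
                new[k] += value / (norm * len(like))  # plain average (assumption)
        change = max(abs(a - b) for a, b in zip(new, prior))
        prior = new
        if change < tol:
            break
    return prior, j


# Hypothetical sample of four clusters with different N_a values.
phi_n, n_iter = iterate_prior([5, 8, 12, 20], p=0.1)
print("iterations used:", n_iter)
print("most probable N:", phi_n.index(max(phi_n)) + 1)
```

If the loop stops only because the iteration limit is reached, the input and output distributions still differ by more than the tolerance, and the final hypothesis should not be regarded as self-consistent.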

6. Conclusions 

Throughout this work, we have explicitly developed the use of 
the IMF to obtain different physical parameters of stellar systems 
from limited information. We made extensive use of the IMF 
as a pdf, which allowed us to make proper use of probability 
theory and, in particular, the properties of sampling distributions 
(where the total number of stars in the system is included) and 
conditional probabilities. 

We studied the methodology to obtain the distribution of possible
$N$ and $M$ values from the knowledge of the set of the most
massive stars in the system. The result depends on the values
of $m_a$ and $N_a$, and on the hypothesis about the overall distribution
of the number of stars in clusters, $\Phi_N(N)$, including the limits of
such a distribution (especially the lower one).